SUMMARY

The sequence of a 5'-proximal region of mRNA 5 of coronavirus MHV-JHM was 
determined by chain-terminator sequencing of cDNA subcloned in M13. The sequence
contained two long open reading frames of 321 bases and 264 bases, overlapping by five 
bases but in different frames. Both open reading frames are initiated by AUG codons 
in sequence contexts that are relatively infrequently used as initiator codons. The 
smaller, downstream open reading frame encoded a neutral protein (mol. wt. 10200) 
with a hydrophobic amino terminus. The larger, 5'-proximal open reading frame 
encoded a basic protein (mol. wt. 12400) which lacks internal methionine residues. 
With the exception of the AUG codon initiating the downstream open reading frame, 
no internal AUG codons were found within the sequence covered by the upstream open 
reading frame. These results suggest that the MHV-JHM mRNA 5 is translated to 
produce two proteins by a mechanism involving internal initiation of protein synthesis. 
Preliminary evidence is presented showing that the downstream open reading frame is 
functional in vivo. 


INTRODUCTION 

Coronaviruses are pleomorphic, enveloped viruses which replicate in the cytoplasm of 
vertebrate cells. Their genome is a single-stranded, infectious RNA of mol. wt. about 6 x 10^6.
Their molecular biology has recently been reviewed by Siddell et al. (1983). The most widely 
studied member of the coronavirus group is murine hepatitis virus (MHV). MHV virions 
contain three structural proteins: peplomer (or E2), membrane (or E1) and nucleocapsid
protein. Infection by MHV results in the production of seven mRNA species in infected cells,
representing the genomic RNA and six subgenomic mRNAs (mol. wt. from 0.6 x 10^6 to
3.7 x 10^6). The mRNAs form a nested set with a common 3' terminus. Lai et al. (1983) showed
that each mRNA possesses a common 5' leader, derived from the 5' end of the genomic RNA.
Sequencing of mRNAs 6 and 7 (Armstrong et al., 1984; Skinner & Siddell, 1983) and of an
intergenic region of genomic RNA (Spaan et al., 1983) showed that this leader is about 70 bases
long.

The translation in vitro of size-fractionated MHV mRNAs in cell-free systems or oocytes
(Rottier et al., 1981; Leibowitz et al., 1982; Siddell, 1983) has shown that the major primary
translation products of mRNAs 3, 6 and 7 are the polypeptide components of the virion
peplomer, membrane and nucleocapsid proteins respectively. The sizes of the primary
translation products (150K, 26K, 50K) and the sizes of the ‘unique’ sequences in the respective
mRNAs (4.5 kb, 0.7 kb, 1.8 kb) suggest that each unique sequence encodes, and is translated
into, a single polypeptide. The translation in vitro of the genome-sized mRNA 1 to produce a
series of related, approximately 200K polypeptides, which are thought to represent viral
polymerase components (Leibowitz et al., 1982), together with sequence analysis of mRNAs 6
and 7 (Armstrong et al., 1984; Skinner & Siddell, 1983), is also consistent with this idea. Only
mRNA 2, which translates in vitro to produce a 30K to 35K polypeptide but has a ‘unique’ coding
capacity of 80K, is not fully consistent with such a model.
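
The comparison between unique-region length and product size can be illustrated with a rough calculation. The sketch below is not part of the original analysis; it assumes an average residue mass of 110 Da (a conventional approximation, not a value taken from this paper) and uses the lengths and product sizes quoted above.

```python
# Rough coding-capacity check for the 'unique' regions discussed in the text.
# ASSUMPTION: an average amino acid residue mass of 110 Da (not from the paper).
AVERAGE_RESIDUE_MASS_DA = 110


def coding_capacity_kda(unique_length_nt: int) -> float:
    """Approximate maximum polypeptide size (in kDa) encoded by a region."""
    codons = unique_length_nt // 3
    return codons * AVERAGE_RESIDUE_MASS_DA / 1000


# Unique-region lengths (nt) and observed primary product sizes (K) from the text.
observed = {
    "mRNA 3": (4500, 150),
    "mRNA 6": (700, 26),
    "mRNA 7": (1800, 50),
}

for name, (length_nt, product_k) in observed.items():
    capacity = coding_capacity_kda(length_nt)
    print(f"{name}: capacity ~{capacity:.0f}K, observed product {product_k}K")
```

Under this assumption each observed product accounts for most of the estimated capacity of its unique region, whereas a 30K to 35K product from a region with a capacity of 80K (mRNA 2) does not.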

The MHV subgenomic mRNAs are produced in non-equimolar amounts in infected cells.
Messenger RNAs 4 and 5 are minor species and have similar sizes. The translation in vitro of
RNA fractions containing both mRNAs has produced a 14K to 14.5K viral polypeptide
(Leibowitz et al., 1982; Siddell, 1983) but assignment to a specific mRNA has not been possible.
In the accompanying paper (Skinner & Siddell, 1985), we present evidence that the RNA 
sequences comprising the unique region of mRNA 4 encode and are translated to produce this 
polypeptide. In this paper we present the sequence for the unique region of mRNA 5. 
Unexpectedly, our analysis suggests that this region, in contrast to the other MHV mRNAs 
(with the possible exception of mRNA 2), encodes two proteins. The organization of the coding 
sequences and the implications of these results are discussed. 

METHODS 

Materials. Avian myeloblastosis virus reverse transcriptase was obtained from Life Sciences (St Petersburg, 
Fla., U.S.A.). Synthetic oligonucleotides for cDNA synthesis, M13 sequencing (17-mer), M13 hybridization 
probes, as well as oligo(dG)12-18, were supplied by Pharmacia P-L Biochemicals. Escherichia coli DNA polymerase
I, S1 nuclease, terminal deoxynucleotidyl transferase and T4 polynucleotide kinase were obtained from Bethesda
Research Laboratories. Lyophilized calf intestinal alkaline phosphatase was from Boehringer Mannheim. T4 
DNA ligase was from New England Nuclear. Restriction enzymes were from Pharmacia P-L Biochemicals, 
Bethesda Research Laboratories, Boehringer Mannheim and Renner (Dannstadt, F.R.G.). Radiochemicals were 
supplied by Amersham Buchler. 

Synthesis and cloning of double-stranded cDNA. Isolation of virus and of polyadenylated RNA from MHV-infected
Sac(-) cells was performed as described previously (Siddell et al., 1980). Genomic RNA was isolated by
phenol/chloroform extraction of purified virus. Single-stranded cDNA was prepared according to protocols
described by Land et al. (1983) except that a synthetic primer (3'-ATTAGATTTGA-5', Pharmacia P-L
Biochemicals) was used, at a concentration of 300 µg/ml, instead of oligo(dT). The primer is complementary to
genomic and mRNA 7 sequences just upstream of the initiation site for translation of the nucleocapsid protein
(Skinner & Siddell, 1983). Second-strand cDNA synthesis, cloning of double-stranded cDNA and characterization
of cloned cDNA were as previously described (Skinner & Siddell, 1983), except that mapping of restriction
enzyme sites was not performed (sizes of restriction fragments were, however, determined).

Nucleotide sequencing. Fragments of cDNA inserts were generated by a variety of restriction enzymes and were 
either cloned as a mixture or as single fragments (purified by electroelution from polyacrylamide gels) into the M13
vectors mp8 and mp9 (Messing & Vieira, 1982). Fragments were then sequenced using the chain-terminator 
method of Sanger et al. (1977). Sequence data were analysed and assembled by the programs of Staden (1982). 

Direct sequencing of RNA. A 56 bp fragment of DNA was isolated following cleavage of the MHV-A59-specific
cDNA clone (in pAGS1125) with AluI (positions 297 to 352 in Fig. 3). The fragment (400 ng) was annealed to 60 µg
of MHV-A59-infected cell RNA in 80% formamide, 40 mM-PIPES pH 6.4, 1 mM-EDTA, 0.4 M-NaCl at 37 °C for
3 h (the melting temperature of the fragment in this buffer having been determined to be 32 to 34 °C). The
annealed RNA and primer were precipitated in 0.3 M-sodium acetate, 70% ethanol and chain-terminator
sequencing was performed with 10 µg of the annealed RNA and primer in 50 mM-Tris-HCl pH 8.1, 50 mM-KCl,
8 mM-MgCl2, 1 mM-dithiothreitol and 1 unit RNase inhibitor (Amersham Buchler) per µl. Each 10 µl reaction
contained 3 units reverse transcriptase, 20 µCi [α-32P]dATP (3000 Ci/mmol), and 7 pmol dATP. Dideoxy ATP
was used at 0.2 µM; ddCTP, ddGTP and ddTTP were used at 2.5 µM; dCTP, dGTP and TTP were at 25 µM. After
30 min at 42 °C, a chase was performed with 50 µM-dNTP for 30 min at 42 °C. Electrophoresis was as described by
Sanger et al. (1977).

Primer extension on infected cell mRNA. The same primer as used for cloning was dephosphorylated and
5' end-labelled with 32P using protocols described by Maniatis et al. (1982). Two pmol of the primer and 6 µg poly(A)+
RNA from MHV-A59-infected cells were heated at 95 °C for 3 min, frozen on dry ice and then thawed in
50 mM-Tris-HCl pH 8.3, 140 mM-KCl, 8 mM-MgCl2, 4 mM-sodium pyrophosphate, 0.4 mM-dithiothreitol, 1 mM-dNTPs
and 1 unit RNase inhibitor per µl. Reverse transcriptase (50 units) was added and the reaction was incubated at
42 °C for 1 h, when a further 50 units of reverse transcriptase was added and incubation was continued for another
hour. Following phenol extraction and ethanol precipitation, a quarter of the sample was electrophoresed by 
alkaline agarose electrophoresis (McDonnell et al., 1977) in a vertical gel. The gel was neutralized in 7% TCA 
(30 min) and after drying was exposed to Fuji RX film (without screens). 

Northern hybridizations. Infected cell RNA was electrophoresed in 1% formaldehyde-agarose gels and was 
transferred onto nitrocellulose (Schleicher & Schüll) according to Maniatis et al. (1982).
M13 hybridization probes were made from sequenced M13 clones of the appropriate polarity using the method
of Hu & Messing (1982). The 56 bp AluI fragment described earlier was labelled at the 5' end using protocols
described by Maniatis et al. (1982). Cloned mRNA 7 cDNA (Skinner & Siddell, 1983) was labelled by nick 
translation using the method of Rigby et al. (1977). 

Prehybridization was carried out in 50% formamide, 5 x SSPE, 5 x Denhardt’s solution, 100 µg denatured
salmon sperm DNA per ml, 0.1% SDS at 42 °C and hybridizations were performed in the same buffer except that
it contained 1 x Denhardt’s solution. Filters were washed twice in 2 x SSPE, 0.1% SDS and twice in 0.1 x SSPE,
0.1% SDS, at room temperature.

Labelling and electrophoresis of intracellular proteins. Procedures for the infection, labelling and preparation of 
total or cytoplasmic cell lysates of MHV-infected cells or mock-infected Sac(-) cells have been described previously
(Siddell et al., 1980, 1981). Samples were electrophoresed on 15% discontinuous SDS-polyacrylamide gels as
described by Laemmli (1970). 

Translation in vitro of size-fractionated RNA. Cytoplasmic, polyadenylated RNA from MHV-JHM-infected 
cells was fractionated on sucrose-formamide gradients and was translated in an L-cell lysate as described 
previously (Siddell, 1983). 


RESULTS 

Identification of open reading frames 

MHV-JHM-specific cDNA was synthesized using intracellular polyadenylated RNA as a 
template. The largest cDNA clone isolated, in pJMS1010, hybridized against all the viral
mRNAs except mRNA 7. Sequence determination and comparison with the MHV-A59
sequence for mRNA 6 (Armstrong et al., 1984; M. A. Skinner, unpublished) revealed that one
end of the clone was positioned 297 bases upstream from the priming sequence, possibly due to
incomplete second-strand synthesis. This position, 1983 bases from the 3' end of the genome,
excluding the poly(A) tail, is designated as -1983. The beginning of the membrane protein
E1-coding sequence (encoded by mRNA 6) was identified at position -2370 and three large open
reading frames (ORF) were found at positions -2633 to -2370 (ORF C), -2949 to -2629
(ORF B) and -3350 to -2934 (ORF A). Upstream of ORF A is an ORF of 1160 bases (M. A.
Skinner, unpublished) extending up to (and presumably beyond) the end of the currently 
sequenced DNA. Fig. 4(a) shows the general arrangement of these ORFs. 

As the sizes reported for MHV mRNAs 4 and 5 vary considerably (from 1.2 x 10^6 to
1.5 x 10^6 for mRNA 4 and from 1.08 x 10^6 to 1.2 x 10^6 for mRNA 5; see review by Siddell et al.,
1982), we decided to map these ORFs to the mRNAs by hybridization analysis. A 56 base
pair AluI restriction fragment from ORF B, an M13 hybridization probe from ORF A and an
M13 hybridization probe from a position upstream of ORF A (within the 1160 base ORF) were
hybridized against a nitrocellulose filter to which viral mRNAs, fractionated by formaldehyde- 
agarose electrophoresis, had been transferred (for exact positions of these probes see Fig. 1). 
This analysis (Fig. 1) showed that the region 560 bases and more upstream of ORF A (and 
therefore within the 1160 base ORF) was located in the ‘unique’ sequence of mRNA 3. ORF A 
was located in the ‘unique’ region of mRNA 4 and ORF B (and, therefore, also ORF C) was 
within the ‘unique’ region of mRNA 5. 

We then used primer extension analysis to map the 5' ends of the mRNAs. The primer used 
for cloning was extended on infected cell RNA (Fig. 2) and extension products were assigned to 
the subgenomic mRNAs on the basis of the relative intensity of the stops (compared to the 
relative abundance of mRNAs in MHV-A59-infected cells, Fig. 1), the approximate sizes of the 
mRNAs and the hybridization data described above. The two strongest stops, at about 80 and 
800 bases, corresponded well to the known sizes of mRNA 7 and mRNA 6 (showing them to end 
at -1755 and -2475, respectively). A clear, but fainter stop (at about 1400 bases, -3075) was
assigned to mRNA 5. A number of minor extension products (about 1200 bases) were observed,
but these corresponded well with a run of stops observed during direct chain-terminator
sequencing of MHV-A59-specific intracellular RNA (see below and Fig. 3) and most likely do
not represent mRNA termini. Finally, a clear and yet fainter stop (at about 1800 bases, -3475)
was assigned to mRNA 4. Allowing for a leader sequence of about 70 bases at the 5' end of each
mRNA, this result suggests that the body of mRNA 4 begins at about position -3400 and the
body of mRNA 5 begins at about position -3000.
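
The arithmetic behind these assignments can be sketched as follows. This is an illustration only: it assumes that the 5' end of the primer lies at position -1676 (the value given for the primer in the legend to Fig. 1) and that the quoted product lengths include the primer itself. The function names are hypothetical.

```python
# Map primer extension product lengths to genome positions (numbered from the 3' end).
# ASSUMPTIONS: primer 5' end at -1676 (Fig. 1 legend); product length includes the primer.
PRIMER_5_END = 1676   # bases from the 3' end of the genome
LEADER_LENGTH = 70    # approximate leader length quoted in the text


def extension_stop(product_length: int) -> int:
    """Genome position reached by an extension product of the given length."""
    return -(PRIMER_5_END + product_length - 1)


def body_start(product_length: int) -> int:
    """Approximate start of the mRNA body once the 5' leader is subtracted."""
    return extension_stop(product_length) + LEADER_LENGTH


# Approximate product lengths (bases) assigned to each mRNA in the text.
stops = {"mRNA 7": 80, "mRNA 6": 800, "mRNA 5": 1400, "mRNA 4": 1800}

for name, length in stops.items():
    print(f"{name}: stop at {extension_stop(length)}, body from about {body_start(length)}")
```

With these assumptions the stops fall at -1755, -2475, -3075 and -3475, and the bodies of mRNAs 5 and 4 begin near -3000 and -3400, in line with the positions quoted above.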

Fig. 1. Hybridization of M13 probes and an AluI restriction fragment to infected cell RNA separated
by formaldehyde-agarose electrophoresis and transferred to nitrocellulose. (a to c) MHV-JHM-infected
cell RNA; (d to f) MHV-A59-infected cell RNA. Probes used were: (a) M13 probe from 554 to 947
bases upstream of ORF A; (b) M13 probe from position -3137 to -3060, within ORF A; (c, d) nick-translated
mRNA 7 cDNA clone (Skinner & Siddell, 1983); (e) nick-translated MHV-A59 cDNA clone
(pAGS1125) extending from the primer (-1676) to -2840 and therefore including the ORF for
membrane protein E1; (f) kinase-labelled AluI fragment (-2731 to -2676) from within ORF B.
Numbers refer to mRNAs. X is a minor RNA species, the relative abundance of which varies from
preparation to preparation. Its nature is currently under investigation.

Fig. 2. Primer extension on infected cell RNA to map the 5' ends of subgenomic mRNAs. The 
synthetic primer used for cloning was extended on poly(A)+ RNA isolated from cells infected with
MHV-A59 as described in Methods. The products were analysed on a 1% vertical alkaline agarose gel.
(a) Primer extension; extension products assigned to the 5' ends of subgenomic mRNAs are indicated.
P represents the position of the 11-mer primer. (b) Labelled DNA markers (λ EcoRI/HindIII digest);
lengths in kilobases. 


5' non-coding sequences of mRNA 5

Upstream of the apparent coding region of mRNA 5 we have been able to identify the 
sequence GUUCUAAAC. This sequence is very similar to the sequence AAUCUAAAC which is
found upstream of the mRNA 4-coding sequence (Skinner & Siddell, 1985). The latter sequence
is identical to a sequence in the intergenic region upstream of the coding sequence of mRNA 7
(Spaan et al., 1983) and differs by only one base from a sequence upstream of the E1-coding
sequence of mRNA 6 (AAUCCAAAC; M. A. Skinner, unpublished). It was postulated that such
homologous sequences might be involved in regulating the initiation of synthesis of the bodies of
MHV mRNAs (Armstrong et al., 1983; Spaan et al., 1983).

  1 TAGT TCTAAAC CTCATCTTAATTCTGGCCGTCCATACACACTTAGGCACTTGCCGAAGTA 60

     MetThrProProAlaThrTrp..Ile-
    TATGAGACCAACNGCCACATGGXXATTTGGCATGTNAGTGATGCNTGGTTNCGCCGCNCG
 61 TATGACACCACCAGCTACATGGAGATTTGGCTTGTGAGTGACGCCTGGCTACGCCGAACG 120
                      MetGluIleTrpLeuValSerAspAlaTrpLeuArgArgThr

    CGGGACTTTGGTGTCNNTCGCCTNGAAGATNNNNNNNNNNNANNNNATTATAGCCAACCC
121 CGAGACTTTGGTGTCACTCGACTTGAAGATTTTTGCTTCCAATTTAATTATTGCCAACCC 180
    ArgAspPheGlyValThrArgLeuGluAspPheCysPheGlnPheAsnTyrCysGlnPro

    CGAGTNNGTTATTGTAGAGTTCCTTTAAAGGCTTGGTGTAGCAACCAGGGTAAATTTGCA
181 CGAGTTGGTTATTGTAGAGTTCCTTTAAAGGCTTGGTGTAGCAACCAGGGTAAATTTGCA 240
    ArgValGlyTyrCysArgValProLeuLysAlaTrpCysSerAsnGlnGlyLysPheAla

    GCGCAGTTTACCCTAAAAAGTTGCGAAAAACCAGGTCACGAAAAATTTATTACTAGCTTC
241 GCGCAGTTTACTCTTAAAAGTTGCGAAAAATCAGGCCACCAAAAATTCATTACTAGCTTC 300
    AlaGlnPheThrLeuLysSerCysGluLysSerGlyHisGlnLysPheIleThrSerPhe

    ACGGCCTACGGCAGAACTGTCCAACAGGCCGTTAGCAAGTTAGTAGAAGAAGCTGTTGAT
301 ACGGCCTACGCGAAAACAGTCAAACAGGCCGTTAGTAAGCTAGTAGAAGAAGCTGCTGAT 360
    ThrAlaTyrAlaLysThrValLysGlnAlaValSerLysLeuValGluGluAlaAlaAsp

    TTTATTGTTTTTAGGGCCACGCAGCTCGAAAGAAATGTTTAATTTATTCTTTACAGACAC
361 TTTATCATCTGGAGAGCCACGCAGCTCGAAAGAAATGTTTAATTTATTCCTTACAGACAC 420
    PheIleIleTrpArgAlaThrGlnLeuGluArgAsnValEnd
                                      MetPheAsnLeuPheLeuThrAspTh

    AGTATGGTATGTGGGGCAGATTATTTTTATATTCGCAGTGTGTTTGATGGTCACCATAAT
421 AGTATGGTATGTGGGGCAGATTATCTTTATAGTCGCAGTGTGTTTGATGGTCACCATAAT 480
    rValTrpTyrValGlyGlnIleIlePheIleValAlaValCysLeuMetValThrIleIl

    TGTGGTTGCCTTCCTTGCGTCTATCAAACTTTGTATTCAACTTTGCGGTTTATGTAATAC
481 TGTGGTTGCCTTCCTTGCGTCTATTAAACGTTGTATTCAACTTTGCGGTTTATGTAATAC 540
    eValValAlaPheLeuAlaSerIleLysArgCysIleGlnLeuCysGlyLeuCysAsnTh

    TTTGGTGCTGTCCCCTTCTATTTATTTGTATGATAGGAGTAAGCAGCTTTATAAGTATTA
541 TTTGTTGCTGTCTCCCTCTATTTATCTGTATAATAGGAGTAAGCAGCTTTATAAGTATTA 600
    rLeuLeuLeuSerProSerIleTyrLeuTyrAsnArgSerLysGlnLeuTyrLysTyrTy

                                        -Asp.IleEnd
    TAATGAAGAAGTGAGACTGCCCCTATTAGAGGTGGATGATXATCTAATCCAAACATTATG
601 TAATGAAGAAGTGAGACCGCCCCCGTTAGAGGTGGATGATAATATAATCCAAACATTATG 660
    rAsnGluGluValArgProProProLeuGluValAspAspAsnIleIleGlnThrLeuEn
                                                             Met

    AGTAGTACTACT
661 AGTAGTACCACT 672
    d
    SerSerThrThr


Fig. 3. Sequence derived from the DNA clone representing the ‘unique’, 5'-proximal coding sequences
that are found in mRNA 5. The sequence is numbered arbitrarily from 1 (equivalent to 3027 bases from
the 3' end of the genome) to 672. The numbered line is the sequence derived from MHV-JHM. The line
above shows the MHV-A59 sequence as derived by sequencing a cDNA clone, pAGS1125 (188 to 672),
or by direct, chain-terminator sequencing of RNA (61 to 187). The complete deduced protein sequences
of ORFs B and C of MHV-JHM are shown, as is part of the ORF coding for E1 membrane protein. The
predicted amino-terminal sequence of the protein encoded by MHV-A59 ORF B and the predicted
carboxyl-terminal sequence of the protein encoded by MHV-A59 ORF C are also shown, above the
MHV-A59 sequence. N indicates bases not determined in the direct RNA sequencing and X indicates
positions of deletions in MHV-A59. The underlined sequence indicates the homologous sequence found
in genomic sequences upstream of the coding region of MHV mRNAs. AUG codons, initiating the
translation of ORFs B and C in MHV-JHM and MHV-A59 and of the ORF encoding E1 protein, are
boxed. The sequencing strategy is shown in Fig. 4(b).




M. A. SKINNER, D. EBNER AND S. G. SIDDELL 


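[Fig. 4 diagram. (a) Genome map with a scale from -4 to -1 kb, oriented towards the 3' end, showing the cDNA clones JMS1010 and AGS1125, mRNAs 7, 6, 5 and 4, and ORFs labelled E2 (?), B, E1 and N, followed by poly(A). (b) Sequencing strategy for the MHV-JHM and MHV-A59 clones, with positions -3027 and -2356 marked.]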


Fig. 4. (a) General arrangement of ORFs, mRNAs and cDNA clones aligned with the genome of
MHV, numbered from the 3' end. Clone JMS1010 is an MHV-JHM clone derived from intracellular,
polyadenylated RNA. Clone AGS1125 is an MHV-A59 clone derived from genome RNA. (b)
Sequencing strategy for sequences shown in Fig. 3. Arrows indicate the direction and extent of
sequencing of M13 subclones. The upper central line shows positions of restriction enzyme sites on the
MHV-JHM cDNA clone. Scale marks above the line indicate each 100 bases. The numbers indicate the
position of the sequence relative to the 3' end of the genome. The lower central line shows where the
sites differ in the MHV-A59 clone. Arrows above these lines represent sequencing of MHV-JHM
cDNA, while those below represent sequencing of MHV-A59 cDNA. Although not all the MHV-JHM
sequence was obtained from both strands, the corresponding region of the MHV-A59 sequence was.
Open boxes show the positions of the ORFs. The hatched box shows the primer used for direct chain-terminator
sequencing of MHV-A59 RNA. The dotted line extending from it illustrates the extent of
direct RNA sequencing. Restriction enzyme sites used: #, HaeIII; x, AluI; O, RsaI.


Nature of the open reading frames within the ‘unique’ region of mRNA 5 

The sequences of ORF B and C, located within the unique region of mRNA 5, are shown in 
Fig. 3. At position 79 of the sequence (position -2949 on the genome), the second AUG codon
of mRNA 5 initiates a long ORF (B, 321 bases), capable of encoding a protein of mol. wt. 12400
(107 residues). This ORF overlaps by five bases the start of the second, downstream ORF (C,
position 395), which is in a different reading frame. The second ORF (264 bases) potentially
encodes a protein of mol. wt. 10200 (88 residues). It in turn overlaps the start of the membrane
protein E1 ORF (within the unique region of mRNA 6) by one base.
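
The coordinates quoted in this paragraph are mutually consistent, as the following sketch shows. It converts between the arbitrary numbering of Fig. 3 and genome positions, using only the statement in the legend to Fig. 3 that position 1 lies 3027 bases from the 3' end of the genome. The helper names are illustrative and not part of the original analysis.

```python
# Convert Fig. 3 numbering to genome positions and check the ORF arithmetic.
FIG3_ORIGIN = 3027  # Fig. 3 position 1 lies 3027 bases from the 3' end of the genome


def to_genome(fig3_pos: int) -> int:
    """Genome position (negative, counted from the 3' end) of a Fig. 3 position."""
    return -(FIG3_ORIGIN - fig3_pos + 1)


def orf_end(start: int, length: int) -> int:
    """Last coding base of an ORF, excluding the termination codon."""
    return start + length - 1


orf_b = (79, orf_end(79, 321))    # upstream ORF, Fig. 3 positions
orf_c = (395, orf_end(395, 264))  # downstream ORF, Fig. 3 positions

overlap = orf_b[1] - orf_c[0] + 1           # bases shared by the two ORFs
frame_offset = (orf_c[0] - orf_b[0]) % 3    # non-zero means different reading frames

print("ORF B:", to_genome(orf_b[0]), "to", to_genome(orf_b[1]), "residues:", 321 // 3)
print("ORF C:", to_genome(orf_c[0]), "to", to_genome(orf_c[1]), "residues:", 264 // 3)
print("overlap:", overlap, "bases; frame offset:", frame_offset)
```

The result reproduces the genome positions given in the first section of the Results (-2949 to -2629 for ORF B and -2633 to -2370 for ORF C), the five-base overlap and the residue counts of 107 and 88.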

Fig. 5. Hydropathy plots of the predicted products of ORFs B and C, plotted according
to the analysis of Kyte & Doolittle (1982). The vertical scale is the average hydropathy (+5 to -5) for a
frame of seven amino acids. The base line is at -0.49, the average hydropathy of the 20 amino acids.
Hydrophobic sequences appear above the base line. Markers along the horizontal scale are at intervals
of 25 amino acids. (a) Plot for the 12.4K product of the first ORF; (b) plot for the 10.2K product of the
second, downstream ORF.
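
A windowed hydropathy profile of this kind can be computed as in the sketch below, which uses the published Kyte & Doolittle index values and a frame of seven residues. The ORF C sequence is transcribed here in one-letter code from the translation shown in Fig. 3 and should be treated as approximate; the function and variable names are illustrative only.

```python
# Kyte & Doolittle (1982) hydropathy index for the 20 amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Predicted MHV-JHM ORF C product (88 residues), transcribed from Fig. 3.
ORF_C = (
    "MFNLFLTDT"
    "VWYVGQIIFIVAVCLMVTII"
    "VVAFLASIKRCIQLCGLCNT"
    "LLLSPSIYLYNRSKQLYKYY"
    "NEEVRPPPLEVDDNIIQTL"
)


def hydropathy_profile(seq: str, window: int = 7) -> list:
    """Average hydropathy of each consecutive window along the sequence."""
    return [
        sum(KD[aa] for aa in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


baseline = sum(KD.values()) / len(KD)   # mean index of the 20 amino acids
profile = hydropathy_profile(ORF_C)
peak = max(range(len(profile)), key=profile.__getitem__)

print(f"baseline {baseline:.2f}")
print(f"most hydrophobic window starts at residue {peak + 1}: {ORF_C[peak:peak + 7]}")
```

The mean of the index values gives the base line of -0.49 used in the plots.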


The product of the upstream ORF (B) was predicted to be a basic protein whereas the product 
of the downstream ORF (C) would be a neutral hydrophobic protein. Hydropathy plots of both 
are shown in Fig. 5. The protein encoded by the downstream ORF would have a strongly 
hydrophobic region between amino acid residues 9 and 37. Conspicuously, the first ORF 
contains no AUG codons within the coding sequence, in or out of frame, except for the one that 
would function as initiator for the downstream ORF. 
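
The absence of internal AUG codons can be checked directly on the sequence. The sketch below scans the MHV-JHM ORF B sequence (positions 79 to 399, transcribed from Fig. 3 with obvious reading errors corrected, and therefore to be treated as approximate) for ATG triplets in all three frames and for in-frame termination codons.

```python
import re

# MHV-JHM ORF B, Fig. 3 positions 79 to 399 (DNA sense), transcribed from Fig. 3.
ORF_B = "".join([
    "ATGGAGATTTGGCTTGTGAGTGACGCCTGGCTACGCCGAACG",                     # 79-120
    "CGAGACTTTGGTGTCACTCGACTTGAAGATTTTTGCTTCCAATTTAATTATTGCCAACCC",   # 121-180
    "CGAGTTGGTTATTGTAGAGTTCCTTTAAAGGCTTGGTGTAGCAACCAGGGTAAATTTGCA",   # 181-240
    "GCGCAGTTTACTCTTAAAAGTTGCGAAAAATCAGGCCACCAAAAATTCATTACTAGCTTC",   # 241-300
    "ACGGCCTACGCGAAAACAGTCAAACAGGCCGTTAGTAAGCTAGTAGAAGAAGCTGCTGAT",   # 301-360
    "TTTATCATCTGGAGAGCCACGCAGCTCGAAAGAAATGTT",                        # 361-399
])
ORF_B_START = 79  # Fig. 3 position of the first base

# A lookahead pattern finds every ATG, including any that would overlap.
aug_positions = [ORF_B_START + m.start() for m in re.finditer("(?=ATG)", ORF_B)]

# Termination codons in the ORF B reading frame.
STOPS = {"TAA", "TAG", "TGA"}
in_frame_stops = [
    ORF_B_START + i for i in range(0, len(ORF_B), 3) if ORF_B[i:i + 3] in STOPS
]

print("ATG triplets at Fig. 3 positions:", aug_positions)
print("in-frame stops within ORF B:", in_frame_stops)
```

On this transcription the only ATG triplets are the initiator at position 79 and the one at position 395 that would initiate the downstream ORF.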

Identification of mRNA 5 translation products 

In the accompanying paper we present evidence that the previously described 14/14.5K virus-specific
polypeptide found in MHV-infected cells should be assigned to mRNA 4 (Skinner &
Siddell, 1985). Therefore, no intracellular polypeptide(s) which could represent the primary 
translation product(s) of mRNA 5 have, to date, been identified. With the information provided 
by the sequence analysis we therefore decided to reexamine the polypeptides synthesized in 
MHV-infected cells. Fig. 6 shows the results of these experiments, performed for both MHV- 
JHM and MHV-A59. In both cases, an infection-specific polypeptide of 9K to 10K was 
detected, although more readily in MHV-A59-infected cells. The apparent mol. wt. of this 
polypeptide (Fig. 6c) suggests that it might represent the product of the downstream ORF in the
unique region of mRNA 5. A larger infection-specific polypeptide of 11K to 13K (which would
be the predicted size of the translation product of the 5'-proximal ORF in mRNA 5) could not be 
identified. A polypeptide of this size was detected in MHV-A59-infected cells, even at late times 
of infection, but could not be discriminated by electrophoresis from a host cell polypeptide with 
a similar apparent mol. wt. 

The 9K to 10K infection-specific polypeptide could be detected in cell lysates that were
prepared so as to minimize proteolytic degradation. However, to exclude the possibility that this
polypeptide was a degradation product of a more abundant virus-specific protein we translated
size-fractionated MHV-JHM mRNA in vitro. These experiments showed that the synthesis of the
9K to 10K polypeptide corresponds to the abundance of mRNA 5 in the RNA fractions
translated (Fig. 7).

Fig. 6. [...] electrophoresis were performed as described previously (Siddell et al., 1980, 1981). To minimize
possible protein degradation, inhibitors of protein degradation were used in (c), as described in
Siddell et al. (1981). The polypeptide described in the text is indicated by ◁. M, Mock-infected.

[Fig. 7 lane markers: mRNA 4/5, mRNA 6, mRNA 7.]

Fig. 7. In vitro translation of size-fractionated cytoplasmic, polyadenylated RNA from MHV-JHM-infected
cells. The size of products is indicated as mol. wt. x 10^-3. The polypeptide referred to in the
text is indicated by ◁.

Comparison of the mRNA 5 unique sequences of MHV-JHM and A59 

To see if the ORFs found in the unique region of MHV-JHM mRNA 5 were conserved in 
other strains of MHV, we cloned MHV-A59 cDNA. A cDNA clone (in pAGS1125) covering
part of the unique region of mRNA 5 of MHV-A59 was isolated and sequenced (Fig. 3). This 
sequence shows that within the unique region of MHV-A59 mRNA 5 there is conservation of 
the downstream ORF (C) and conservation of the upstream ORF (B) within the extent of the 
clone (position 188). Most of the base changes that were found were conservative but a deleted A 
(position 641 of the MHV-JHM sequence) in the A59 sequence results in termination five 
residues earlier than in MHV-JHM mRNA 5. 

To evaluate the sequence homology between MHV-JHM and A59 in the region between the 
end of the cDNA clone and the 5' end of the body of MHV-A59 mRNA 5, we used direct 
dideoxy sequencing on MHV-A59-infected cell mRNA with an AluI fragment primer (positions
297 to 352). Due to strong stops (particularly in the region 151 to 166), it was not possible to 
determine the complete nucleotide sequence by this method, but the presence of any frameshifts 
was clear. This analysis showed (Fig. 3) that, in this region, most of the base changes were also 
conservative but that two bases (AG) deleted from the MHV-A59 sequence (at positions 83 and 
84 of the JHM sequence) would result in initiation at the 5'-terminal AUG codon in MHV-A59
mRNA 5 and therefore extend the amino-terminus of the MHV-A59 protein by five amino acid
residues. This conclusion must be considered tentative as it is possible that the bases at positions
151 to 166 could include termination codons. The conservative changes further downstream
appear to make this possibility unlikely, but the question will only be resolved by sequence 
analysis of cDNA. 


DISCUSSION 

The results presented in this paper are relevant to the general replication strategy of 
coronaviruses and specifically to the translation product(s) of MHV mRNA 5. They may also, 
however, have wider implications regarding the translation of mRNAs. 

Most eukaryotic mRNAs initiate protein synthesis at a 5'-proximal AUG and in cases where 
an mRNA is structurally polycistronic (as for example, with the exception of mRNA 7, the 
coronavirus mRNAs) it appears that in general only the 5'-proximal cistron is translated, i.e. the
mRNA is functionally monocistronic (Kozak, 1983). However, there are now many examples
where a 5'-proximal, but not necessarily the 5'-terminal, AUG codon in an mRNA is used to
initiate protein synthesis. To account for this observation, Kozak (1983) has surveyed the 5'- 
terminal sequences of many eukaryotic mRNAs and proposed that there is an optimal sequence 
context around an AUG codon, for initiating translation. The majority of AUG codons that are 
positioned upstream from a functional AUG codon, but are themselves non-functional, are 
situated in a different sequence context which is therefore considered sub-optimal. It is 
supposed that the majority of ribosomes would bypass AUG codons in such sub-optimal contexts 
and would be free to scan the mRNA further downstream for an AUG codon in a more 
favourable context. However, AUG codons in sub-optimal contexts can also apparently initiate 
translation to different degrees and therefore in some cases initiation can occur at both upstream 
and downstream AUG codons, giving rise to two proteins which may or may not overlap or be in 
the same reading frame. Examples of this case have been reported for bunyaviruses (Bishop et al.,
1982), simian virus 40 (Jay et al., 1981), adenoviruses (Bos et al., 1981) and possibly
reoviruses (Kozak, 1982; Cenatiempo et al., 1984). In such cases it is clear that each upstream
AUG, even in suboptimal contexts, potentially reduces the number of ribosomes which can 
continue scanning for downstream initiation codons. 

The sequence analysis presented here suggests that MHV-JHM mRNA 5 encodes two 
proteins within its unique sequence, and the sequence context of potential initiating AUG 
codons is consistent with this idea. The 5'-terminal AUG triplet in MHV-JHM mRNA 5 is 
found in a context YNNAUGY (where Y is a pyrimidine) which comprises only 0.5% of the
functional initiators, but 44% of the upstream non-functional initiation codons surveyed by 
Kozak (1983). Furthermore, an in-phase termination codon occurs only 11 triplets downstream 
from this AUG. The AUG codons that initiate both of the long ORFs found in the unique region 
of MHV-JHM mRNA 5 fall within a context indicative of more commonly used initiators, i.e. 
GNNAUGY and YNNAUGG, but not the optimal context. Most conspicuously, with the 
exception of the AUG codon that initiates the downstream ORF, the upstream ORF is devoid 
for over 300 bases of internal AUG codons, either in or out of frame. Consequently, ribosomes 
that bypass the upstream ORF-initiating codon would not have an opportunity to initiate 
translation until the downstream ORF was reached. Our analysis of MHV-A59 mRNA 5 again 
revealed a similar sequence arrangement, although the situation is complicated by the two-base 
deletion (and consequent frameshift) at the 5' end of the upstream ORF. This would result in the 
5' terminal AUG codon being used as the initiating codon for this ORF and the second AUG 
codon (which is used as the initiating codon in the corresponding MHV-JHM ORF) would now 
produce only a six-amino acid product. As described above, the context in which the 5'-terminal 
AUG is located is very rarely used for initiating protein synthesis. Thus, relative to MHV-JHM 
mRNA 5 the level of translation of the product of the upstream ORF would be reduced for 
MHV-A59 mRNA 5. Consistent with this interpretation is our finding that MHV-JHM and 
MHV-A59 show clear differences in the ratio of mRNA 5 to the other viral mRNAs in infected
cells (Fig. 1). Infection with MHV-A59 produces relatively much more mRNA 5, possibly as a
compensatory mechanism to produce increased amounts of the upstream ORF product. As the
sequence differences between MHV-JHM and MHV-A59 mRNA 5 should not reduce the
efficiency of translation of the downstream ORF, its product should therefore be relatively more
abundant in MHV-A59-infected cells, a prediction which appears to be borne out (Fig. 6).
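
The behaviour of the two 5'-proximal AUG codons can be illustrated on the MHV-JHM sequence between positions 61 and 120 of Fig. 3. In the sketch below the two-base deletion is simulated by removing positions 83 and 84 from the MHV-JHM sequence; this is a hypothetical construct used for illustration and not the MHV-A59 sequence itself, which was only partly determined in this region.

```python
# MHV-JHM sequence, Fig. 3 positions 61 to 120 (DNA sense), transcribed from Fig. 3.
JHM_61_120 = "TATGACACCACCAGCTACATGGAGATTTGGCTTGTGAGTGACGCCTGGCTACGCCGAACG"
STOPS = {"TAA", "TAG", "TGA"}


def sense_codons_before_stop(seq, start):
    """Number of codons (AUG included) read from `start` before an in-frame stop.

    Returns None when no termination codon occurs within the fragment.
    """
    for n, i in enumerate(range(start, len(seq) - 2, 3)):
        if seq[i:i + 3] in STOPS:
            return n
    return None


first_aug = JHM_61_120.find("ATG")                  # 5'-terminal AUG (position 62)
second_aug = JHM_61_120.find("ATG", first_aug + 1)  # ORF B initiator (position 79)

# Hypothetical variant: positions 83 and 84 (AG) removed, as described for MHV-A59.
deleted = JHM_61_120[22:24]
variant = JHM_61_120[:22] + JHM_61_120[24:]

print("JHM, first AUG:", sense_codons_before_stop(JHM_61_120, first_aug))
print("JHM, second AUG:", sense_codons_before_stop(JHM_61_120, second_aug))
print("variant, first AUG:", sense_codons_before_stop(variant, first_aug))
print("variant, second AUG:", sense_codons_before_stop(variant, second_aug))
```

In the MHV-JHM sequence the termination codon is the eleventh triplet downstream from the 5'-terminal AUG, whereas the second AUG reads through the whole fragment. With the two bases removed the situation is reversed: the 5'-terminal AUG reads through, and the second AUG is followed by a stop after six codons.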

Boursnell & Brown (1984) have recently published a preliminary sequence of mRNA B, the 
second most abundant viral mRNA in avian infectious bronchitis virus-infected cells. This 
coronavirus mRNA was also found to contain two overlapping ORFs in the ‘unique’ region. In 
this case, the downstream ORF extends into the coding region of mRNA A, and overlaps the 
ORF encoding nucleocapsid protein by 55 bases. Thus, the arrangement of ORFs which we 
have elucidated for MHV mRNA 5 may be more widespread amongst the coronaviruses. 
Boursnell & Brown (1984) did not, however, provide any evidence relating to the in vivo or in vitro 
translation products of IBV mRNA B. 

In relation to the replication strategy of MHV, our data also indicate that the idea that all the 
subgenomic mRNAs are functionally monocistronic (Siddell et al., 1982) may require 
modification. It appears that at least mRNA 5 (and possibly one or more of the remaining 
mRNAs that have not yet been sequenced) may be functionally bi- or polycistronic. Our 
preliminary experiments on the synthesis of mRNA 5 translation products in infected cells were 
consistent with this conclusion but were not conclusive. It remains to be shown that the 9K to 
10K infection-specific polypeptide is indeed the translation product of the downstream ORF in 
mRNA 5. Also, using [35S]methionine labelling we were unable to identify any putative product
of the upstream ORF. We are currently using different radioactive amino acids and 
polyacrylamide gel systems with higher resolution to search for this product. A knowledge of the 
primary structure of the predicted products also now allows us to raise antisera against synthetic 
peptides which can be used to isolate and positively identify these polypeptides. 

Finally, our sequence analysis allows us to predict the nature of the polypeptides potentially 
encoded in MHV mRNA 5. The product encoded by the upstream ORF of MHV-JHM mRNA 
5 was predicted to be a basic protein (mol. wt. 12400) with no regions of high hydrophobicity. 
Between residues 30 and 85 the protein is basic with a charge of +8, compared with a charge of 
+6 for the molecule as a whole. Thus, it is possible to speculate that this central domain is
involved in an interaction with RNA. The predicted product (mol. wt. 10200) of the 
downstream ORF has a very hydrophobic amino-terminal region, which is likely to be a 
membrane anchoring domain, and a neutral carboxy terminus. It is therefore unlikely to interact 
directly with RNA. Thus, if this protein was involved in, for example, the siting of a 
replication/transcription complex, it would have to do so by interaction with another protein 
component of the complex rather than by direct interaction with RNA. Expression of these 
products from the cDNA cloned in appropriate vectors would enable larger quantities of the 
proteins to be produced for further studies on their function. 
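
The charge values quoted above can be reproduced with a simple count. The sketch below takes the predicted ORF B product in one-letter code (transcribed from the translation in Fig. 3, and therefore approximate) and counts lysine and arginine as +1 and aspartate and glutamate as -1. Ignoring histidine and the terminal groups is an assumption of this sketch, not a statement of how the original values were derived.

```python
# Predicted MHV-JHM ORF B product (107 residues), transcribed from Fig. 3.
ORF_B_PROTEIN = (
    "MEIWLVSDAWLRRT"
    "RDFGVTRLEDFCFQFNYCQP"
    "RVGYCRVPLKAWCSNQGKFA"
    "AQFTLKSCEKSGHQKFITSF"
    "TAYAKTVKQAVSKLVEEAAD"
    "FIIWRATQLERNV"
)


def net_charge(seq: str) -> int:
    """Net charge counting K and R as +1 and D and E as -1.

    ASSUMPTION: histidine and the terminal groups are ignored.
    """
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


whole = net_charge(ORF_B_PROTEIN)
central = net_charge(ORF_B_PROTEIN[29:85])   # residues 30 to 85 inclusive
methionines = ORF_B_PROTEIN.count("M")

print(f"whole molecule: {whole:+d}; residues 30 to 85: {central:+d}")
print(f"methionine residues: {methionines}")
```

On this transcription the counts agree with the values in the text: +6 for the whole molecule, +8 for residues 30 to 85, and a single methionine (the initiator), which is relevant to the [35S]methionine labelling experiments.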

In a recent paper Boursnell et al. (1984) reported that homologous sequences exist in three 
infectious bronchitis virus mRNAs at the position where the bodies of the mRNAs commence. 
This sequence vftCUUAAC is very similar to the equivalent MHV sequence discussed in this 
paper ($\JCUAAAC). 